Elizabeth Bekele, Alison Cheek
2022-05-03
#This will allow us to filter through our data
library(tidyverse)
library(dplyr)
#This will help us plot figures to showcase our findings
library(ggplot2)
#This will help us organize and display our data as necessary
library(knitr)
library(kableExtra)
#This expands our plot uses
library(plotly)
#Scientific Notation Disabled
options(scipen=999)
Import the deaths-due-to-air-pollution data
## Rows: 6,468
## Columns: 7
## $ Entity <chr> "Afghanistan", "Afghan…
## $ Code <chr> "AFG", "AFG", "AFG", "…
## $ Year <int> 1990, 1991, 1992, 1993…
## $ Air.pollution..total...deaths.per.100.000. <dbl> 299.4773, 291.2780, 27…
## $ Indoor.air.pollution..deaths.per.100.000. <dbl> 250.3629, 242.5751, 23…
## $ Outdoor.particulate.matter..deaths.per.100.000. <dbl> 46.44659, 46.03384, 44…
## $ Outdoor.ozone.pollution..deaths.per.100.000. <dbl> 5.616442, 5.603960, 5.…
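The import call itself does not appear in this output. As a minimal sketch, assuming the data was read from a CSV with base R's `read.csv()`, the dotted column names above are what R produces when it sanitizes headers that contain spaces, parentheses and commas. The temporary file, the name `deaths_demo` and the exact header text are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical stand-in for the downloaded CSV: two rows written to a temp file.
# The header wording is an assumption; the values come from the glimpse output.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"Entity","Code","Year","Air pollution (total) (deaths per 100,000)"',
  '"Afghanistan","AFG",1990,299.4773',
  '"Afghanistan","AFG",1991,291.2780'
), csv_path)

# read.csv() runs make.names() on the header by default (check.names = TRUE),
# which turns every space, parenthesis and comma into a dot.
deaths_demo <- read.csv(csv_path)
glimpse(deaths_demo)
```

Names this long are awkward to type, which is why renaming is the next step.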
We are going to rename the columns and glimpse the data.
colnames(deaths_df) <- c("country", "acronym", "year", "total_deaths", "indoor_deaths", "outdoor_deaths", "ozone_deaths")
glimpse(deaths_df)
## Rows: 6,468
## Columns: 7
## $ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist…
## $ acronym <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
## $ year <int> 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1…
## $ total_deaths <dbl> 299.4773, 291.2780, 278.9631, 278.7908, 287.1629, 288.0…
## $ indoor_deaths <dbl> 250.3629, 242.5751, 232.0439, 231.6481, 238.8372, 239.9…
## $ outdoor_deaths <dbl> 46.44659, 46.03384, 44.24377, 44.44015, 45.59433, 45.36…
## $ ozone_deaths <dbl> 5.616442, 5.603960, 5.611822, 5.655266, 5.718922, 5.739…
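Assigning to `colnames()` works by position, so it silently mislabels columns if the file's column order ever changes. A possible alternative, sketched here on a one-row toy frame (the name `demo` is hypothetical), is to rename by name with `dplyr::rename()`, which fails loudly when an expected column is missing.

```r
library(dplyr)

# Toy one-row frame that reuses original column names from the first glimpse output.
demo <- data.frame(
  Entity = "Afghanistan",
  Code = "AFG",
  Year = 1990L,
  Indoor.air.pollution..deaths.per.100.000. = 250.3629
)

# rename(new_name = old_name): each pair is matched by name, not by position.
renamed <- demo %>%
  rename(
    country = Entity,
    acronym = Code,
    year = Year,
    indoor_deaths = Indoor.air.pollution..deaths.per.100.000.
  )

names(renamed)
```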
Variables that interest us here include:
Now, let’s take a look at the population data.
## Rows: 12,595
## Columns: 3
## $ Country.Name <chr> "Aruba", "Afghanistan", "Angola", "Albania", "Andorra", "…
## $ Year <int> 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 196…
## $ Count <int> 54211, 8996973, 5454933, 1608800, 13411, 92418, 20481779,…
To get a general idea of the deaths data frame we made, let’s make a plot to see what’s happening. This is a plot of indoor versus outdoor deaths around the world, by country.
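The plotting code is not included in this output. One way such a plot could be drawn, as a rough sketch assuming a scatter plot with one color per country, is shown here on the four Afghanistan rows visible in the glimpse output (the name `demo` is hypothetical).

```r
library(ggplot2)

# First four Afghanistan rows, copied from the glimpse output.
demo <- data.frame(
  country = "Afghanistan",
  year = 1990:1993,
  indoor_deaths = c(250.3629, 242.5751, 232.0439, 231.6481),
  outdoor_deaths = c(46.44659, 46.03384, 44.24377, 44.44015)
)

# One point per country-year, colored by country.
p <- ggplot(demo, aes(x = indoor_deaths, y = outdoor_deaths, color = country)) +
  geom_point() +
  labs(x = "Indoor deaths per 100,000", y = "Outdoor deaths per 100,000")
p
```

With every country in the full data set mapped to its own color, the legend alone becomes unreadable.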
This is a mess, so we chose two countries from each continent (one high-population and one low-population) to graph.
We selected a high-population country from each continent and used the formula below to determine the target low population.
Low population = high population * 0.10
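To make the formula concrete, here is a small worked sketch using the 1960 counts visible in the population glimpse. Picking the candidate whose population is nearest to the 10% target is an assumption about how the formula was applied, and these five countries are only an illustration (they are not on one continent).

```r
# 1960 population counts copied from the population glimpse output.
population <- data.frame(
  country = c("Aruba", "Afghanistan", "Angola", "Albania", "Andorra"),
  count = c(54211, 8996973, 5454933, 1608800, 13411)
)

# Treat Afghanistan as the high-population country for this illustration.
high_pop <- population$count[population$country == "Afghanistan"]
target_low <- high_pop * 0.10

# Assumed selection rule: take the remaining country closest to the target.
candidates <- population[population$country != "Afghanistan", ]
closest <- candidates$country[which.min(abs(candidates$count - target_low))]

target_low
closest
```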
Which country has the highest average death count?
Let’s make a table showing the high- and low-population countries and their respective death counts due to pollution.
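A table like this can be built by averaging within each country and sorting. The sketch below assumes the average is a plain mean of `total_deaths` over the years; "Country B" and its values are invented purely to give the grouping something to compare.

```r
library(dplyr)
library(knitr)

deaths_demo <- data.frame(
  country = c("Afghanistan", "Afghanistan", "Country B", "Country B"),
  year = c(1990L, 1991L, 1990L, 1991L),
  # Afghanistan values are from the glimpse output; Country B values are invented.
  total_deaths = c(299.4773, 291.2780, 20, 30)
)

# Mean total death rate per country, highest first.
avg_deaths <- deaths_demo %>%
  group_by(country) %>%
  summarise(avg_total = mean(total_deaths), .groups = "drop") %>%
  arrange(desc(avg_total))

kable(avg_deaths, digits = 2)
```

The first row of the sorted result answers the "highest average" question directly.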
Here’s a graph to visualize the previous table.
So we’ve looked at the deaths due to pollution, but what percentage of the population was affected?
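The death columns are rates per 100,000 people, so converting one to a percentage is simple arithmetic. The helper names below are hypothetical, and the count conversion only makes sense if the rate is a crude (not age-standardized) rate, which we are assuming here.

```r
# Deaths per 100,000 -> percent of the population.
rate_to_percent <- function(rate_per_100k) {
  rate_per_100k / 100000 * 100
}

# Deaths per 100,000 -> estimated number of deaths for a given population.
rate_to_count <- function(rate_per_100k, population) {
  rate_per_100k / 100000 * population
}

# Afghanistan's 1990 total rate from the glimpse output: about 0.30 percent.
pct <- rate_to_percent(299.4773)

# The same rate applied to a hypothetical population of one million.
est <- rate_to_count(299.4773, 1000000)

pct
est
```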
Which type of pollution has the greatest number of deaths?
High-population countries:

| country | avg_indoor | avg_outdoor | avg_ozone |
|---|---|---|---|
| Pakistan | 87.7427944 | 50.52063 | 10.440656 |
| Nigeria | 75.8755074 | 35.21678 | 2.117076 |
| Brazil | 19.4258385 | 26.84194 | 2.740342 |
| Germany | 0.7170881 | 25.47078 | 2.343892 |
| Australia | 0.2485867 | 17.20789 | 0.360452 |
| United States | 0.1656402 | 22.79947 | 3.915093 |

Low-population countries:

| country | avg_indoor | avg_outdoor | avg_ozone |
|---|---|---|---|
| Canada | 0.0651156 | 16.38423 | 1.9697041 |
| Chile | 8.6932699 | 27.17442 | 0.8504919 |
| Malawi | 132.1891749 | 13.81151 | 3.3870514 |
| New Zealand | 0.2908622 | 15.56872 | 0.0727512 |
| Serbia | 35.8762796 | 42.71254 | 2.9395671 |
| Sri Lanka | 44.5428441 | 24.77233 | 0.4304406 |
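To read the leading pollution type off a table like these without scanning by eye, one option is to reshape to long form and keep the largest value per country. This sketch retypes the first table above; the names `avg_high` and `leading_type` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Values copied from the first table above.
avg_high <- data.frame(
  country = c("Pakistan", "Nigeria", "Brazil", "Germany", "Australia", "United States"),
  avg_indoor = c(87.7427944, 75.8755074, 19.4258385, 0.7170881, 0.2485867, 0.1656402),
  avg_outdoor = c(50.52063, 35.21678, 26.84194, 25.47078, 17.20789, 22.79947),
  avg_ozone = c(10.440656, 2.117076, 2.740342, 2.343892, 0.360452, 3.915093)
)

# One row per country and pollution type, then keep the largest per country.
leading_type <- avg_high %>%
  pivot_longer(starts_with("avg_"), names_to = "type", values_to = "avg_deaths") %>%
  group_by(country) %>%
  slice_max(avg_deaths, n = 1) %>%
  ungroup()

leading_type
```

For these six countries, indoor pollution leads in Pakistan and Nigeria, and outdoor pollution leads in the other four.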
Let’s look at the previous two decades and compare the death counts. Has there been a change?
This is the first decade, 1996-2006.
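A decade comparison like this could be set up by labeling each year with its window and averaging within it. In the sketch below, the 2007-2017 window for the second decade is an assumption (only 1996-2006 is stated), and "Country A" with its values is invented.

```r
library(dplyr)

demo <- data.frame(
  country = "Country A",
  year = c(1995L, 1996L, 2006L, 2007L, 2017L),
  total_deaths = c(50, 40, 30, 20, 10)  # invented values for illustration
)

by_decade <- demo %>%
  mutate(decade = case_when(
    between(year, 1996, 2006) ~ "1996-2006",
    between(year, 2007, 2017) ~ "2007-2017"  # assumed second window
  )) %>%
  filter(!is.na(decade)) %>%  # years outside both windows (here 1995) are dropped
  group_by(country, decade) %>%
  summarise(avg_total = mean(total_deaths), .groups = "drop")

by_decade
```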
Let’s graph the previous tables!
The first decade.
This shows the second decade.
By comparing each pollutant type, we can determine which year and country had the highest number of deaths.
Indoor Deaths
Outdoor Deaths
Ozone Deaths
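Finding the peak year and country for each pollutant type can be done in one pass, as in this minimal sketch. It uses only the four Afghanistan rows visible in the glimpse output, so the result illustrates the technique and says nothing about the full data set.

```r
library(dplyr)
library(tidyr)

# First four Afghanistan rows, copied from the glimpse output.
deaths_demo <- data.frame(
  country = "Afghanistan",
  year = 1990:1993,
  indoor_deaths = c(250.3629, 242.5751, 232.0439, 231.6481),
  outdoor_deaths = c(46.44659, 46.03384, 44.24377, 44.44015),
  ozone_deaths = c(5.616442, 5.603960, 5.611822, 5.655266)
)

# Long form, then the single highest row within each pollutant type.
peaks <- deaths_demo %>%
  pivot_longer(ends_with("_deaths"), names_to = "type", values_to = "deaths") %>%
  group_by(type) %>%
  slice_max(deaths, n = 1) %>%
  ungroup()

peaks
```

Each row of `peaks` keeps its `country` and `year`, so the answer for every type comes out together.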
Which is worse: outdoor or indoor pollution?
Let’s reintroduce a graph we looked at earlier. This time, we will combine the pollutant types.
We cannot conclude which is worse.
[https://www.kaggle.com/datasets/akshat0giri/death-due-to-air-pollution-19902017]
[https://www.epa.gov/ground-level-ozone-pollution/ground-level-ozone-basics]
[https://www.health.nsw.gov.au/environment/air/Pages/outdoor-air-pollution.aspx]
[https://www.kaggle.com/datasets/imdevskp/world-population-19602018]